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Abstract 



Problems in signal processing and medical imaging often lead to calculating 
sparse solutions to under-determined linear systems. Methodologies for solving this 
problem are presented as background to the method used in this work where the 
problem is reformulated as an unconstrained convex optimization problem. The 
least squares approach is modified by an /i-regularization term. A sparse solution 
is sought using a Barzilai-Borwein type projection algorithm with an adaptive step 
length. New insight into the choice of step length is provided through a study 
of the special structure of the underlying problem. Numerical experiments are 
conducted and results given, comparing this algorithm with a number of other 
current algorithms. 

1 Introduction 

Many problems in signal processing and medical imaging can be described by the following 
linear model, 



where A e jj mx ™ ( m <n), i 6 IR m is a vector of observations, x G M n is the vector of 
unknowns and v is a noise vector usually assumed to be Gaussian. The aim is to determine 
a sparse solution x. This is an ill-posed problem because A is under-determined. In the 
over-determined case a standard approach is to solve for x in a least-squares sense by 
minimizing || Ar— However in the under-determined case least-squares regression leads 
to over-fit. Therefore a standard technique in statistical and signal processing problems is 
to incorporate a regularization term. As the solution vector x is known to be sparse, early 
work, (see for example [7]), suggest regularization with an li term (rather than Tikhonov 
(or l 2 ) regularization [15] )• This leads to the unconstrained convex optimization problem, 



b = Ax + v, 




min \\Ax — b\\\ + A||x||i. 



(2) 



X 



Here A > is a regularization parameter. The value of the scalar A is important, for 
example, if A is too large then the solution is the trivial one x = 0, (see [IE]). The 
introduction of the Zx-regularization term significantly promotes a sparse solution while 



maintaining the convexity of the objective function. 



In the next section we briefly mention some other approaches to finding sparse solutions 
to problem ( [T|) b efore focusing attention on specific implementations for solving problem 
(J2|. Section 4.5 introduces the highly successful variation on steepest descent proposed 
by Barzilai and Borwein [Ij and this is exploited in the algorithm presented in section [4] 
with numerical results presented in section |5j 



2 Previous Approaches 

Several optimization algorithms have been recently proposed with the aim of determining 
a sparse x satisfying ([!]). Some notable approaches are discussed here. 

In 2005 Candes and Romberg [5] described an algorithm to solve the problem, 

min||ar||i s.t. \\ Ax - < e 2 . (3) 

X 

The use of the Zx-norm induces sparsity in x while the constraint ensures b pa Ax. (We 
recall that b is observed in the presence of noise so it is reasonable not to enforce b = Ax ex- 
actly). The algorithm, so-called h-magic, is available online at http : //www . llmagic . org. 

More recently other groups have focused on devising algorithms for the solution of (|2]). 
A group at Stanford University [16] began their work by formulating the dual. A new 
variable z G M. m was introduced leading to the equivalent primal problem, 

min z T z + A||x||i 

x,z 

s.t. z = Ax — b. 



Dual variables z/j were associated with the equality constraints Z{ and the Lagrange dual 
problem is 

max G{y) = — \v T v — u T b 
s.t. \(A T v)i\ < A. 

The primal problem ^ satisfies Slater's condition [TB] so the optimal value f(x*) of the 
primal problem is equal to that of the dual. The duality gap was used as a stopping 
criterion for their algorithm. (For more on convex duality see for example jl]). Next ^ 
was transformed into the convex quadratic problem with linear inequality constraints: 

min \\Ax — + A Y17=i u i 

x,u 

s.t. —Ui < Xi <Ui i = 1, . . . , n. 

An interior-point truncated Newton method was used to solve this problem. The Matlab 
code for this (h-ls) algorithm is publicly available online at 
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http : //www. stanford.edu/~boyd/ll_ls/. 

A third group (Figueiredo, Nowak and Wright, [TT]) reformulated ^ as the bound con- 
strained quadratic programme, 

min l\\A(u-v)-b\\j + TZt 1 u l + TZl 1 v, (4) 

u,v 

s.t. u, v > 

where the substitution x = u — v, u,v > has been made. A projected Barzilai-Borwein 
(PBB) [8] type method was used to determine an approximate solution of Q. Matlab code 
for this (GPSR) algorithm is publicly available online at http : //www. lx . it . pt/~mtf /GPSR. 

Many other algorithms exist with applications to compressed sensing and the associated 
signal and image processing problems. For example, the SpaRSA (Sparse Reconstruction 
by Separable Approximation) algorithm [23], and the FISTA (Fast Iterative Shrinkage- 
Thresholding) algorithm [2], are two very recent algorithms which are further considered 
in section [5j Other current algorithms include: a projected Barzilai-Borwein type al- 
gorithm with applications in computed tomography |22j; the Gradient Projection, GP, 
algorithm (and the Steplength Selection for Gradient Projection, GPSS, variant) [TT]; a 
gradient descent algorithm which uses a thresholding step to encourage sparsity [14j ; and 
an algorithm for a non-convex compressed sensing problem, [6] . 

2.1 A Proposed Approach 

The Zi-magic algorithm [5] for finding a sparse solution to problem ([TJ has three levels of 
iteration (nested loops) and as a consequence, runs relatively slowly. When the problem 
is reformulated as (J2]), the algorithm in [16] uses two levels of iteration while the approach 
in [TT] uses only one level as they do not use a backtracking line search. 

The approach proposed here also aims to determine a sparse solution x using problem 
formulation A Barzilai-Borwein type algorithm with an alternating step-length, a, 
is employed. This approach (known as the Projected Alternating Barzilai-Borwein or 
PABB algorithm) is based on recent work by Dai and Fletcher [8] who have investigated 
a variant of the PABB method. They claim that this implementation performs better 
than the PBB method in practice. 

Our approach uses two levels of iteration, an outer loop defining a search direction and 
new candidate point x, and an inner backtracking line-search loop. However, the back- 
tracking line-search is included only as a safeguard, (as suggested in [S] to prevent iterates 
cycling) . This algorithm only enters the back-tracking line-search loop under certain con- 
ditions which in practice rarely arise. 

Also, as in the case of the l\-ls and GPSR algorithms, our approach only requires matrix- 
vector products involving A and A T . At each iteration there are only two matrix-vector 
products — one vector multiplication with A and one with A T - unless the inner loop 
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is required in which case there is an additional multiplication with A in the backtracking 
line search. The computational effort is therefore kept low in each iteration. 



3 A Reformulation of the Problem 



By making the substitution x = u — v, problem ^ can be recast as the bound constrained 
quadratic programme, 



min \\A(u - v) - b\\l + X J2™=i Ui + X v i 

u, v > 0. 



(5) 



s.t. 

As (§ is now a differentiable problem, the associated gradient is, 

9 



2A T (A(u -v)-b)+Xl' 
-2A T (A(u -v)-b)+Xl 



(where 1 is a vector of ones) and the associated Hessian is 

H = 2 



' A T A —A T A 
—A T A A T A 



(6) 



(7) 



At the solution of problem (|5]) we have either Ui = or v i = 0. Problems @ and @, 
although different, share a common minimizer. We prefer to solve problem ^ as the 
objective function is now differentiable. 



Another point to note, (as mentioned in [H]), is that the introduction of a shift, u <— u+A 
and v <— v + A leaves x unchanged. The gradient ^ is also independent of this shift 
although the objective function value ^ increases by 2AA. Therefore in the algorithm 
presented in section |4.5[ the value of the primal objective function is calculated using 
formula ^ rather than ^ as this gives a lower value of the objective function. 

The Lagrange dual of primal problem ^ is 

max G[v) = -\v T v -v T b (8) 

s.t. \{A T v)i\ < X 

where G(u) is the dual objective function. (This is derived in more detail in |16j . For 
more on duality see for example [12] or [E]). A dual feasible point v gives a lower 
bound on the optimal value of the primal problem and therefore an indication of the error 
in the computed solution. Furthermore, as (|2| satisfies Slater's condition, the optimal 
value of the primal problem is equal to the optimal value of the dual. Thus we define the 
duality gap to be 

rj = \\Ax - b\\ 2 + A||x||i - G{v). (9) 
This can be used as a stopping criterion which is described later. 
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4 Barzilai-Borwein Key Features 



In 1988 Barzilai and Borwein devised a novel gradient method for optimization problems, 
PQ. This Barzilai-Borwein algorithm has the unusual property that at some iterates the 
function value increases. Despite this property, the algorithm performs very well in prac- 
tice. In fact, forcing a monotonic decrease in function value at each iteration can seriously 
impair the practical performance of the algorithm, (see [S]). 

There has also been much interest in this algorithm more recently: the implementation of 
dynamical retards [18] . analysis of convergence properties [ID] and the introduction of a 
cyclic Barzilai-Borwein variant [9], (see also the review by Fletcher [13]). In the following 
subsections we introduce and discuss some of the key features of the PABB algorithm. 



4.1 Step Length and the Projection Operator 

Consider first the unconstrained case. One of the key points of the Barzilai-Borwein 
algorithm is the step length a. The quasi-Newton equation is, 

y k = Hs k , (10) 

where y k = g(x k ) — g(%k-i), s k = x k — x k _\ and H is the Hessian (H = V 2 /(x)). Suppose 
we approximate H by the matrix a~ x I where a > 0. Solving 

min \\y k - a _1 s fc || 2 

a 



gives 



s kVk 

Similarly using al to approximate H~ x and solving 

2 



o?* = ^. (11) 



mm \\ay k - s k \ 



a 



yields 



of* = 4^. (12) 



Equations (11 ) and (12) give the two step lengths used in the Barzilai-Borwein algorithm. 
In the case of the problem expressed by (|5]) we have the following result. 

Theorem 4.1. For the function pD, if A e M. mxn , (m < n), has orthonormal rows, then 
the Barzilai-Borwein step length ( fill ) satisfies 
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Proof. Let Xk = Uk — Vk so that u = Uk — Uk-i and let v 

Vk = g{xk) - g(xk-i) = 



Vk 



v k -\. Then using ([6|, 



2A T A(u - v) 
-2A T A(u-v), 



so 



vlvk 



\{u — v) T A T AA T A{u — v) 
i(u - vf 'A T A(u - v) 



as AA T = I. Using the quasi- Newton condition (10) we know that s^yk = s^Hsk where 
H is the Hessian matrix (J7|). So 



s^Hsk 



2A T A —2A T A 
—2A T A 2A T A 

2(u-v) T A T A(u-v) 



[u T v T ] 





u 




V 



Thus 



a 



BB 2 



vlvk 



2(u - v) T A T A(u - v) _ 1 
8{u - v) T A T A{u - v) ~ 4 



□ 



Now we return to the constrained optimization case. A second key feature of this Barzilai- 
Borwein variant is the projection operator. Because we have a constrained optimization 
problem (|5]), once a search direction and step have been determined the projection opera- 
tor (defined below) ensures the new candidate point x is feasible. If we define the feasible 
set of ^ to be 

Q = {x : lb < x < ub} 

where lb and ub are lower and upper bounds respectively, then the projection operator 
onto Q is 



Pn{x) = mid(lb,x,ub) 



(13) 



where mid(lb, x, ub) is the vector whose ith component is the median of the set {lbi,Xi, ubi}. 
This operator ensures any x is kept within the feasible region. 



4.2 An Adaptive Non-monotone Line-Search 

The algorithm we propose includes a backtracking line-search loop. This line-search was 
proposed by Dai and Fletcher jH] who commented, "the method again has a reference 
function value f r and each iteration must improve on the reference value. The method 
involves a small integer parameter L > 0, and f r is reduced if the method fails to improve 
on the previous best value of / in at most L iterations. We dispense with the requirement 

f(x k + pd k ) < f r + Opgldk 



6 



(where d k is the search direction, p > is a decreasing sequence of values and 9 G (0, 1)), 
to obtain a sufficient reduction in /, since in real computation any reduction is bounded 
uniformly away from zero by a small amount . . . and this is sufficient to ensure global 
convergence. We refer to this kind of line search as an adaptive non-monotone line search." 
The update strategy is clarified by the following pseudo-code where initially f r = oo, and 
fc = fbest = f(xi). 



if f{Xk)< fbest 

fbest = f(,X k ), fc = f(x k ), I = 

else 

f c = max{/ c , f(xk)}, 1 = 1 + 1 
if l = L 

fr = fc, fc = f(x k ), I = 

end 

end 



This code reduces the reference function value f r to the candidate function value f c if fb est 
has not been improved upon after L iterations. This is enough to enforce convergence 
while still allowing non-monotone behaviour. 

The choice of parameter L is important. It represents the number of iterations allowed 
before a function decrease is enforced. For example, L = 1 implies that the function value 
must be decreased at each iteration (a monotonic decrease in the objective function). As 
mentioned in section [4], forcing a decrease in the objective function can impair the prac- 
tical performance of the Barzilai-Borwein algorithm. It is suggested in [8] that suitable 
choices are L = 4 or L = 10. Initial testing showed little difference between the two 
choices, and so in the numerical results presented in section [5j L = 4 is used. 



4.3 Bounds on Allowable Step length 

In an optimization problem we would like to take the step 

x k +i = %k + ctpk (14) 



where a is the step length (given by either (11) or ( 12 )), and pk is the search direction. The 



objective function (JH| is not strictly convex, that is, has a positive semi-definite Hessian. 
The original Barzilai-Borwein convergence theory applied to strictly convex quadratic 
functions and therefore extra safeguards on a may be needed to account for zero curva- 
ture. 

In the strictly convex, quadratic, unconstrained case, the step lengths are automatically 
bounded by the reciprocal of the smallest eigenvalue of the Hessian, [20J. In the present 
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context the reciprocal of the smallest eigenvalue leads to an infinitely large step length so 
[2T] discuss the use of an upper bound a max = 10 30 as a safeguard which is implemented if 
the algorithm finds a direction of zero curvature, (or near zero curvature) at any iterate. 
However a max = 10 30 is not desirable in the current context when the solution is known 
to satisfy an a priori bound, even in the presence of zero curvature. Hence we propose an 
upper bound on all iterates Xfc clS follows. 

A more appropriate crude upper bound for the solution vector is outlined here. At the 
unique minimizer x* we have, 

f(x*)<f(x) = \\Ax-b\\l + X\\x\\ 1 

for any x. Putting x = gives, 

M\x*h< fix*) <b T b, 

so that 

and therefore 

Wl < X- (15) 

This is an a priori bound on the components of x* and hence an a priori upper bound 
on each Ui and v j. It is possible to improve this bound dynamically but numerical trials 
suggest this is not worthwhile. In any case we do not expect this bound to be active at 
the solution. 

Another result discussed below also supports the inclusion of an upper bound as a safe- 
guard against overly large step lengths. Let B = A T A. Then the Hessian matrix ^ can 
be written as the Kronecker product 



H = B® 



2 -2 
-2 2 



Using the properties of the eigenvalues of a Kronecker product (see [3]) H has 2n — m 
zero eigenvalues corresponding to 2n — m directions of zero curvature. The remaining m 
positive eigenvalues of H are given by the positive eigenvalues of 4B. In the simple case 
when A has orthonormal rows, the positive eigenvalues of B are all equal to 1. 

The dimension of the subspace of directions of zero-curvature is high. Thus there is an 
increased probability of encountering search directions for which the change in gradient 
would be tiny resulting in very large values of a for the next iteration. This supports the 



inclusion of the upper bound (15) which the solution is known to satisfy, which seems 
more appropriate than the ot max = 10 30 approach. 
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4.4 A Stopping Criterion 

An appropriate stopping criterion for any optimization algorithm is paramount to ensure 
that an accurate solution is located. A standard approach is to terminate when the norm 
of the projected gradient (see for example, [8]), is sufficiently small, indicating a stationary 
point has been found. The approach we favour is to use the duality gap as an indication 
of distance from the correct solution. So our termination criterion is 

/; < tol (16) 



G(y) 



where G(y) and 77 are defined in ^ and ^ respectively and tol is some user-defined 
tolerance. 



The stopping criterion (16) and a tolerance, tol = 10 , have been implemented in the 



BBCS algorithm outlined in this work and is used in all numerical results. 

4.5 A Barzilai-Borwein Algorithm for Compressed Sensing 

Here we propose an algorithm based upon the ideas in the previous sections which aims 
to solve problem Q. We refer to this algorithm as the Barzilai-Borwein algorithm for 
Compressed Sensing - BBCS algorithm. The BBCS algorithm is based on the algorithm 
described by Dai and Fletcher in [S] but has been tailored to problem (JS]) with tighter 
bounds on the allowable candidate vectors. 

Recall the substitution x = u — v. Let 



where u and v are defined as follows, 

■u = max(x, 0), v = min(x, 0) 
The steepest descent search direction is 

z = z k - a k „ig k (17) 
where g k is the gradient defined by (Jsl) at the point z k and a is defined by either formula 



(11) or (12). As z may now violate the constraints, the projection operator (13) is used 



to give a point zp say, which is now feasible. The projection operator uses the upper 



bound ub = b T b/\ as defined in section (4.3) and a lower bound, lb = 0. Thus the search 



direction p used in the algorithm proposed here is 

p = Zp - z k . 

Based upon the previous arguments a backtracking line search loop may be used to en- 
courage the algorithm to converge. That is, rather than forcing a monotonic decrease 
in function value at each iteration, a backtracking line search loop is used if the lowest 
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function value fb es t has not been improved upon in the previous L iterations. (Recall the 
discussion in section (4.2)). The backtracking line-search is described by 

z + = z k + (3p 

where f3 = |, |, |, . . . until f{z + ) < f r . Enter the adaptive non-monotone line search 
stage and update f r , f c and f^st according to the pseudo-code described in section (4.2). 
Finally, 

X = u — V 



is computed and the duality gap (16) is monitored to check for convergence. 



The results of this section are summarized in algorithmic form. 



Step (Initialization): Set (3 = 1/2, L = 4, function reference values f r , f c 

and fbest, lb = 0, ub = b T b/X and tol = 1CT 6 . 

Step 1: Compute the step length a and gradient g. 

Step 2: Compute z and replace it with its projection, mid(z — ag,lb,ub). 
Step 3: Compute the new search direction and Zk+i- 

Step 4: If required, perform backtracking line-search and update the reference 
values: f r , f c and f best . 

Step 5: Check whether the duality gap is sufficiently small. If so, terminate 
the algorithm, otherwise return to step 1. 



5 Numerical Results 

In this section we present numerical results obtained using the algorithm outlined in 



section |4.5| These results illustrate the performance of the BBCS algorithm and how its 
performance compares with other recent algorithms - namely, the l\-ls algorithm [16] , the 
GPSR algorithm (both the monotone and the non-monotone versions) [TT], the SpaRSA 
algorithm (both the monotone and non- monotone versions) [23], and the FISTA algorithm 



5.1 A Sparse Signal Reconstruction Problem 

The first numerical example demonstrated here is a sparse signal recovery experiment. A 
signal x G IR 4096 consisting of 160 randomly placed spikes of amplitude ±1 was generated. 
A measurement matrix A e jji024x4096 ( re p re senting 1024 observations of the signal x) 
was constructed with Gaussian J\f(0, 1) entries, and then the rows were orthonormalized 
(as for example in [5], [IE]). The observation vector b was formed according to ([TJ where v 
is drawn according to the Gaussian distribution with zero mean and variance a 2 = 10 -4 . 
The regularization parameter A was chosen to be 

A = 0.111^1100, 
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as this large A value seemed to encourage faster algorithm performance in initial numer- 
ical trials. As discussed in section 4.2 the value L = 4 was used along with a relative 
tolerance on the duality gap of 1CT 6 . The step length a was computed using formula (11 ) 
except at every fourth iteration where formula ( 12 ) is used. The initial approximation was 
x = where is a vector of zeros. Figure (5.1 ) shows the reconstruction results. The top 
plot shows the original signal. The middle plots shows the signal reconstructed using the 
BBCS algorithm. The BBCS algorithm does an excellent job finding the positions of the 
non-zero components in the signal. The bottom plot shows the minimum energy solution 
(where x = A T (AA 7 ) -1 !/) . Figure (5.1) also shows the mean squared error, MSE, for 
both signal reconstructions^} The signal reconstructed using the BBCS algorithm finds a 
solution with a low MSE indicating an accurate reconstruction. 



Original Signal, (Objective value=7.1986, Number of Spikes= 160) 



- 



BBCS Reconstruction, (Objective valiie=6.8306,MSE=0.00051086) 



Minimum Energy Signal, (Objective value=13.847,MSE=0.0291 61) 



Figure 1: Sparse signal reconstruction. From top to bottom: original signal, reconstruc- 
tion from noisy observations, minimum energy solution. 



Table (5.1 ) compares the runtimes of the Matlab implementation of our method and three 
existing methods on the problem described in subsection 5.1 The BBCS algorithm is very 
efficient for this small problem. 



5.2 Continuation 

Recent work in [11] and [23] highlighted the possibility of implementing continuation 
schemes in their proposed algorithms. Here the algorithm starts with an initial regular- 
ization parameter A which is then reduced toward some desired value and the algorithm 
is warm-started for each successive value A. 

This scheme seems to have merit - the algorithms with continuation schemes seem to find 
the solution to problem ^ faster then those without continuation schemes. However, we 
stress here that a continuation scheme could be applied to any algorithm as a means of 
improving speed. The results described here focus on the speed of the underlying algo- 
rithm (while maintaining an accurate solution). Thus we compare the proposed algorithm 

1 Here we follow [11] and define the MSE to be ^||a;t™e — x\\^, where Xtrue is the original signal. 



11 



Table 1: CPU Times (Average Over 10 Runs) on the Experiment of Figure 1. (The 
subscript m denotes the monotone version of the algorithm). 



Algorithm CPU Time (seconds) 



BBCS 


0.0770 


BBCS m 


0.0690 


SpaRSA 


0.0710 


SpaRSA m 


0.0810 


GPSR 


0.1230 


GPSR m 


0.0870 


FISTA 


0.1130 


h-ls 


0.9080 



without including a continuation scheme. 



5.3 Scalability Assessment 

An experiment proposed in [TT] and [16] aims to examine how the runtime of an algorithm 
changes as problem size grows. Their experiment is described as follows. Several random 
sparse matrices are considered whose entries are normally distributed. The dimensions of 
these matrices are O.lnxw where n ranges from 10 4 — 10 6 . The sparsity of A is controlled 
to have 3n nonzero elements. For each data set, x is also generated to be sparse with n/4 
randomly placed components of length ±1. The measurements Ax are corrupted with 
Gaussian noise of variance a 2 = 10~ 4 . For each data set the regularization parameter is 
taken as A = COlp^H^. 



An experiment based upon the above was implemented as a way of comparing the scal- 
ability of the GPSR, li-ls, FISTA, SpaRSA and BBCS algorithms. When performing 
this experiment we came across some interesting results. Figure (5.3) shows a plot of the 
original signal and the signals reconstructed by the named algorithms on a problem of 
size n = 10 4 . It is clear from this figure that the algorithms are not reconstructing the 
original signal accurately. For the problem (|2| each algorithm will always look for the 
sparsest solution and we know that we can always expect to find at least n — m zeros in 
the solution. Because the problem is formulated with more than m spikes, the algorithm 
chooses the solution vector which is sparsest and thus does not choose the original signal. 
We stress that these reconstructions are valid — the algorithm is actually finding a solu- 
tion vector x with f(x) << f(x) (where x is the original signal), so from an optimization 
perspective the algorithms are working well. The problem is that the original signal is 
not being reconstructed. Since we are interested in reconstructing the original x signal 
we have therefore decided to choose a scalability assessment based upon a non-random 
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matrix. 
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IT — Is Reconstruction, (Objective value=33.819,MSE=0.27685) 









1^ 











GPSR-Basic Reconstruction, (Objective value=33.819,MSE=0.27633) 
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SpaRSA Reconstruction, (Objective value=33.818,MSE=0.27684> 



:j rn 1 1 ]:i . 1 , i 1 1 ',i irn 



BBCS Reconstruction, (Objective value=33.818,MSE=0.27684) 



Figure 2: A sparse matrix example. From top to bottom: Original signal, l\-ls sig- 
nal reconstruction, GPSR-Basic reconstruction, SpaRSA reconstruction and the BBCS 
reconstruction. (The FISTA algorithm reconstructed x = 0). Clearly none of the algo- 
rithms tested reproduce the original signal and all algorithms find an x vector giving a 
function value of f(x) = 33.8 whereas the original signal gives a much higher function 
value of f(x) = 141.6. 

The scalability assessment proposed here also considers the computational effort required 
as problem size increases. Observation matrices, (which are sub-matrices of a DCT ma- 
trix), were constructed. The dimensions of each matrix were |n x n with n ranging from 
2 14 — 2 20 . Sparse signals with J| spikes of height ±1 were also generated for each matrix 
and the regular ization parameter was chosen to be A = 0.1||A T y||oo. 

All algorithms (BBCS, l\-ls, GPSR, SpaRSA and FISTA), were tested using this experi- 
ment set-up. For each size n, ten matrices were generated and the average CPU time for 
each algorithm was found. The signal length vs average CPU times are shown in figure 



(5.3). From this we see that the BBCS algorithm is performing very competitively with 



the other algorithms. 

The data from the scalability assessment was also used to estimate the computational 
complexity of each algorithm. That is, assume the computational cost is 0(n a ) and esti- 



mate a based upon CPU times as n increases. Table (5.3) gives the empirical estimates 



of the exponent a. As shown, there is very little difference between the empirical com- 



putational complexity of each algorithm (around 1%). Table (5.3) also shows the CPU 
time for each algorithm when n = 2 20 . The BBCS (both monotone and non-monotone 
variants) are performing very competitively with the other algorithms. 
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Scalability Assessment 




Signal Length 



Figure 3: Assessment the change in average CPU times for each algorithm as signal length 
increases. 



6 Conclusion 

The problem of finding sparse solutions to large, under-determined linear systems in the 
presence of noise is an important one in signal processing, particularly in medical imaging. 
In this paper we have discussed a number of recent approaches and have proposed a varia- 
tion of the PABB algorithm which we call the Barzilai Borwein algorithm for Compressed 
Sensing, BBCS, which provides safeguards in the case where the Hessian (|7| is positive 
semi-definite. These include the incorporation of an adaptive non-monotone line search, 
an upper bound on x as a safeguard in the presence of zero curvature and a stopping 
criterion which provides a known bound on the error in our reconstruction. 



The numerical results in Table (5.1) show that our algorithm is competitive with other 
existing algorithms. These results are encouraging because the underlying algorithm does 
not include any continuation schemes which would improve performance further. 



As the scalability experiment shows, as the magnitude of the problem is increased our 
method retains accuracy and efficiency. 

Future work includes the implementation of continuation schemes in the algorithm and 
investigating further the effects of signal reconstruction when the observation data matrix 
is sparse. 
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Table 2: Empirical Estimate of the Exponent and Average CPU time of each algorithm 
(over 10 runs). Again the subscript m denotes the monotone version of each algorithm. 



Algorithm a Value CPU time (n = 2 20 ) 



BBCS 1.082 27.32 

BBCS m 1.108 26.99 

GPSR 1.053 40.62 

GPSR m 1.073 31.35 

SpaRSA 1.075 23.29 

SpaRSA m 1.078 26.80 

FISTA 1.096 34.85 
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